ABSTRACT

Thirty-four years ago, it was postulated that natural populations of Drosophila melanogaster are
comprised of two behavioral morphs termed "rover" and "sitter", and that this variation is caused 
mainly by large-effect alleles at a single locus. Since that time, considerable data has been 
amassed that compares the behavior and physiology of these morphs. Contrary to common 
assertions, however, published support for the existence of common large effect alleles in nature 
is quite limited. To further investigate, we quantified the foraging behavior of 36 natural strains, 
performed a genome-wide association study, and described patterns of molecular evolution at the 
foraging locus. Though there was significant variation in foraging behavior among genotypes, 
this variation was continuously distributed and not significantly associated with genetic variation 
at the foraging gene. Patterns of molecular population genetic variation at this gene also provide 
no support for the hypothesis that for is a target of long-term balancing selection. We propose that
additional data is required to support a hypothesis of common alleles of large effect on foraging 
behavior in nature. Genome-wide association does support a role for natural variation at several 
other loci, including the sulfateless gene, though these associations should be considered 
preliminary until validated with a larger sample size. 

INTRODUCTION 

Sokolowski (1980) described a now classical difference in behavior between two mutant 
strains of Drosophila melanogaster. When placed in a food patch, larvae of one strain would stay 
there and eat ("sitters") while the larvae of the other strain would crawl around while eating and 
visit other food patches ("rovers"). It was later determined that the difference between a strain 
with the sitter trait and a strain with the rover trait could be explained mostly by variation at a 
single locus containing the foraging gene, and that transgenic manipulations of this gene 
phenocopied this variation (Osborne et al. 1997). Though the original variation was described in
strains that had been reared in the lab for decades, variation in foraging behavior was also found 
in natural populations, and a hypothesis was put forth that this behavioral variation is maintained 
by balancing selection in the wild (Sokolowski 1980, Bauer and Sokolowski 1984, Sokolowski, 
Pereira and Hughes 1997). In the 34 years since the original description of this variation, a 
considerable research effort has been focused on "reference" rover and sitter strains, thought to 
be representative of two natural morphs. These reference strains have been repeatedly compared 
and found to differ in many traits in addition to larval foraging, including adult movement 
patterns (Pereira and Sokolowski 1993, Edelsparre et al. 2014), energy storage (lipid vs.
carbohydrates, Kent et al. 2009), glucose homeostasis (Kaun, Chakaborty-Chatterjee and
Sokolowski 2008), thermotolerance (Chen et al. 2011), resistance to sleep deprivation (Donlea et
al. 2012), relative strength of short term vs. long term memory (Kaun et al. 2007, Mery et al.
2007), use of "retroactive interference" (Reaume, Sokolowski and Mery 2011), use of public vs.
private information (Foucaud et al. 2013), and more. The picture painted is one in which D. 
melanogaster populations are composed of two behavioral "morphs", differentiated in
many phenotypic dimensions, that coexist through some type of balancing selection. The only
comparable case we are aware of is variation at the npr-1 gene in Caenorhabditis elegans 
(though there may be non-recombining supergenes with such diverse effects [Lawson, Vander 
Meer and Shoemaker 2012; Thomas et al. 2008]). Similar to the foraging case in D. 
melanogaster, the npr-1 allele from the N2 strain was found to cause solitary foraging, in 
contrast to the group foraging observed in other strains (de Bono and Bargmann 1998). It has 
recently been postulated, however, that this large-effect allele arose as an adaptation to lab 
culture, and may not be maintained by selection in natural populations (Rockman and Kruglyak 
2009; McGrath et al. 2009). We feel that, based on published data, this same scenario cannot be 
ruled out in the case of variation at foraging. We first critique the existing evidence supporting 
the "common alleles of large effect" hypothesis, and then present additional data which fails to 
support this hypothesis. 

Evidence for bimodal trait distributions in natural populations 

If the distribution of trait variation is bimodal, this supports a hypothesis that there is a factor of 
large effect involved (genetic or environmental). In the case of foraging behavior in D. 
melanogaster, several bimodal distributions have been reported. An impressively bimodal 
distribution is shown in a highly cited review that features the foraging story (Sokolowski, 
2001). The source of these data is not reported in the review, but the distribution appears 
identical to figure 1 from de Belle, Hilliker and Sokolowski (1989). If this is indeed the source of 
these data, this bimodality does not address whether there is a common factor of large effect in 
natural populations. The bimodal distribution in de Belle, Hilliker and Sokolowski (1989) is a 
comparison of only two "reference" strains: the EE (ebony^11) strain (one mode) and the B15
strain (the other mode; see below for a further description of these genotypes). The variance 
around each mode is variation among replicate individuals of the same genotype. 

A bimodal distribution is also reported in a note published by the Drosophila Information 
Service (Sokolowski, 1982). This distribution shows variation in foraging behavior in larvae 
collected from fallen pears near Toronto. The distribution of foraging path lengths for these 
larvae is bimodal, with one mode between 60-80 mm (named "rovers") and the other mode at 0-20 mm
(named "sitters"). The bimodality of this distribution supports a hypothesis that there is a
factor of large effect in the data, but this factor is not necessarily genetic. These larvae were 
sampled from nature and assessed directly, so variation among them could be due to genes, the 
environment, and/or gene-by-environment interactions. Indeed, major effects of environmental 
factors were later quantified in laboratory assays (Graf and Sokolowski 1989), and it was later
noted that "a carefully controlled environment is required to minimize the phenotypic overlap
between distinct genotypes" (de Belle, Hilliker, and Sokolowski, 1989). The alternative
possibility, that individuals of the same genotype behave differently due to environmental
influence, therefore cannot be excluded. Indeed, in Sokolowski (2001), it is stated
that "flies with rover alleles can be made to behave as sitters after a short period of food
deprivation".

Evidence for a genetic effect on foraging distance was reported in the original foraging 
paper, but not for genotypes recently sampled from nature (Sokolowski 1980). Larvae of one 
strain, carrying a white blood mutation affecting eye color, traversed a much larger area when
feeding ("foraging") than the other strain, which carried a mutant ebony^11 allele affecting body
color. This behavioral difference did not seem to be caused by either of the pigmentation 
mutations, as ebony is on the third chromosome, white is on the X, and the behavioral difference 
mapped almost entirely to the second chromosome. Individuals with the second chromosome 
from the ebony^11 strain were termed "sitters", and those with the white blood second chromosome
were termed "rovers". Though these strains were given the same names as the larvae sampled from
nature, no evidence is reported that supports this equivalence. Differences between these lab
stocks could have arisen during lab culture. Flybase.org lists the first reference to the ebony^11
allele as a paper from
1926 (Stern, 1926): 54 years of lab culture is likely equivalent to over 1000 generations since 
this strain was sampled from nature. The earliest reference we could find to the white blood strain 
was published in 1945 (Ephrussi and Herold, 1945), which is also a considerable interval. 
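
The figure of over 1000 generations can be checked with simple arithmetic. The sketch below is illustrative only: the generation times of 14 and 18 days are assumptions about typical stock maintenance, not values reported in the cited sources.

```python
# Rough estimate of generations elapsed in continuous lab culture.
# The generation times used below are illustrative assumptions, not reported values.

def generations_elapsed(years, days_per_generation):
    """Approximate number of generations in `years` of continuous culture."""
    return years * 365.25 / days_per_generation

# Interval between the first reference to the allele (1926) and
# the original foraging paper (1980).
years_in_lab = 1980 - 1926

for days in (14, 18):  # assumed generation times, in days
    print(days, round(generations_elapsed(years_in_lab, days)))
```

Under either assumed generation time the estimate exceeds 1000 generations, in line with the statement above.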

The strongest evidence for the existence of a common allele of major effect in nature, to 
the best of our knowledge, was presented by Sokolowski, Pereira, and Hughes (1997). An 
outbred laboratory population was founded from 500 flies, again collected from a Toronto 
orchard, and 500 individual larvae were assayed from this population after a year of lab culture. 
The trait distribution of these individuals was once again bimodal (figure 1, Sokolowski,
Pereira, and Hughes, 1997), and appears very similar to the distribution of larvae measured 
directly from nature (Sokolowski, 1980; Sokolowski, 1982). Moreover, the differences between 
"rover" and "sitter" larvae from this population failed to complement the sitter mutation from the 
ebony^11 lab strain, suggesting that there was variation at the same locus. Together, these data do
support a hypothesis of common, large-effect alleles in nature. However, these data on their own 
still allow room for doubt, especially in light of the unprecedented nature of the foraging story. 
Possible critiques of these data include the complications of epistasis when performing 
complementation tests in uncontrolled genetic backgrounds (Service, 2004). In addition, the 
genotypes found to be bimodal were not isolated independently from nature. Instead, a large 
outbred population was adapted to the lab for a year, then individuals from this population were 
assayed. This means that allele frequencies could evolve between the sampling and 
measurement, so that a rare sitter allele (for example) could become more common. Indeed, the 
same paper proposes that these alleles can rapidly change in frequency in lab culture due to 
density-dependent selection. A more accurate procedure for investigating the frequency of 
natural genotypes in Drosophila is to sample individual females and use them to found 
independent isofemale or inbred lines. This method was used by Bauer and Sokolowski (1984), 
who show little support for the common alleles of large effect hypothesis. Fifteen fertilized 
females were collected from nature (source population not stated), and an isofemale line was 
started from each female's descendants. Path lengths were found to vary significantly among
these 15 lines, implicating heritable (likely genetic) variation. However, the trait values of all 
lines shown would lead to their classification as rovers based on the previous definition. The data 
presented appear to vary continuously from ~50-85 mm, without two modes (based on visual
inspection of figures 1 and 3 from Bauer and Sokolowski 1984). Thirteen of these lines were not 
classified as either rover or sitter, but the line (B1) with the shortest path length was classified as
a sitter. This appears arbitrary, as the behavior of this line appears more similar to the "roving"
white blood stock than to the "sitting" ebony^11 stock (Sokolowski, 1980). The strain with the
longest path length (B15) was designated a "rover". The B1 and B15 lines were referred to as
"extreme", but they do not appear to be statistical outliers compared to the other 13 lines (figure 
1, Bauer and Sokolowski 1984). The primary support for a hypothesis that the genetic difference 
between B1 and B15 was related to the difference between white blood and ebony^11 was that the
mutation(s) causing the difference between the B1 and B15 strains were later mapped mainly to
the second chromosome (Bauer and Sokolowski 1985). This is consistent with a large-effect
allele, but it should be noted that the second chromosome comprises ~2/5 of the euchromatic D.
melanogaster genome, and could therefore harbor causal variants in many genes (Adams et al.,
2000). The strain with the longest path (B15) appears to be the strain that was later used as the
"reference" rover strain, renamed the for^R strain (sometimes also referred to as R, B15B15 or
BB). The strain with the shortest path, however, does not appear to be the reference sitter strain
(EE, S, or for^s) that was used in subsequent mapping experiments. Instead, strains with the
second chromosome from B15 were compared to strains with the second chromosome from the
ebony^11 lab stock in later work (see below).

Other genotypes have also been compared, but these cases do not support the common 
allele of large effect hypothesis either. For example, artificial selection was performed on a sepia 
mutant stock collected by Timothy Prout near the University of California, Riverside, and kept in 
the lab for 15 years before the start of the experiment (Sokolowski and Hansell, 1983). The trait 
values of this stock before selection would classify them as "rovers" based on the previous 
definition (Sokolowski 1980, Sokolowski 1982), as the average path lengths were between 60 
and 110 mm (Sokolowski and Hansell, 1983; Sokolowski, Hansell, and Rotin, 1983). Selection 
was successful in both directions, indicating that there was heritable variation in path length in 
this stock (Sokolowski, Hansell, and Rotin, 1983). However, this is not evidence that the stock 
was segregating a "sitter" allele of large effect. After selection, in fact, the population selected to 
be sitter-like would still have qualified as "rovers" based on the definitions provided (visual 
inspection of figure 3 from Sokolowski, Hansell, and Rotin [1983] suggests a range of ~40-80
mm after selection). These data are therefore consistent with multiple alleles of small or 
moderate effect. 

Additional collections were made from the Toronto pear orchard in 1986, and laboratory 
stocks collected from different regions of a single pear were found to differ in average path 
length (Sokolowski et al., 1986). Only the mean values are shown, however, and no evidence of a
bimodal distribution is presented. Stocks with the second and third chromosomes from the
ebony^11 strain (now called EE) and white blood (now WW) stocks (and stocks derived from them)
were reassayed for comparison to this fresh collection, and the newly sampled short-path-length
strains had intermediate trait values compared to the EE and WW lab strains (figure 2 from
Sokolowski et al., 1986; note the difference in scale between 2a and 2b). The shorter path strains
were called sitters, and the longer path strains were called rovers, but this appears arbitrary 
compared to previous definitions. The only additional report of a bimodal distribution we are 
aware of is for a trait correlated with larval foraging distance, but this appears to be a truncation 
artifact, as flies given a larger area displayed continuous variation with a single mode (figure 2, 
Nagle and Bell 1987). Nearly all other work on rover and sitter that we are aware of focused only 
on the "reference" strains, discussed further below, and therefore cannot speak to the frequency 
of foraging alleles in nature. We therefore feel that these data, though they raise the interesting 
possibility of common behavioral morphs in nature, are insufficient to strongly support such a 
hypothesis. 

Connecting genotype and phenotype 

Regardless of whether or not there are common alleles of large effect at the foraging gene in 
natural populations, considerable evidence connects variation at foraging to variation in some 
behaviors. Efforts to connect genotype and phenotype have focused nearly exclusively on three
reference strains: for^R, for^s, and for^s2. This is understandable, due to the labor-intensive nature of
behavioral characterization, but limits inference regarding natural populations. First, it is not
clear that the for^s strain used has an allele recently isolated from nature, as is sometimes asserted
(Kent et al. 2009; Edelsparre et al. 2014). Efforts to map the genetic basis of path length to a
sub-chromosomal level were first published in de Belle, Hilliker, and Sokolowski (1989): the
sitter allele used in these efforts is the one isolated from the ebony^11 lab stock. The sitter strain,
before being renamed for^s, is described as "EE". Based on Sokolowski (1980) and Sokolowski et
al. (1986), this strain seems to have the second and third chromosomes, and therefore the
foraging allele, from the ebony^11 stock. The rover strain, renamed for^R, is also referred to as
B15B15, and therefore appears to have the second and third chromosomes from the B15 strain
(Bauer and Sokolowski 1984). This B15 or R stock is referred to as for^R by de Belle,
Sokolowski, and Hilliker (1993), and appears to have become the reference rover strain.
Supporting this inference is the common reference to the for^R strain as the parent of the irradiated
strain for^s2, produced from B15B15 by de Belle, Hilliker, and Sokolowski (1989). However, de
Belle, Sokolowski, and Hilliker (1993) also clearly indicate that the strain renamed for^s is the EE
strain, which has the foraging allele from the ebony^11 stock. The final identification of the gene
dg2 (a cGMP-dependent protein kinase, subsequently renamed foraging) as the gene harboring
the for^R and for^s alleles does not state the source of the for^s allele, but the ebony^11 stock seems
likely as it was a continuation of the earlier mapping work, and uses the same stock name
(Osborne et al., 1997). Other sitter-like chromosomes that failed to complement the ebony^11 allele
have also been referred to as for^s in the literature (Sokolowski, Pereira, and Hughes 1997), so it is
possible that other alleles were used.

Regardless of the source of the for^s allele, the for^R and for^s strains were created by
chromosome extraction (Sokolowski 1980; Sokolowski et al. 1986; de Belle, Sokolowski, and
Hilliker, 1989), and therefore differ at a very large number of loci. As a result, it is not possible
to associate traits with foraging alleles simply by measuring the phenotype of these strains (e.g.
MacPherson et al. 2004; Foucaud et al. 2013). Thankfully, most studies also use the for^s2 strain
(Kaun et al. 2007; Mery et al. 2007; Kaun, Chakaborty-Chatterjee & Sokolowski 2008; Kent et
al. 2009; Chen et al. 2011). This sitter-like strain was created from the reference rover strain
using high doses of gamma radiation (5000 rad; de Belle, Hilliker and Sokolowski 1989). This
strain therefore shares a common background with for^R, though it seems likely to have secondary
mutations, despite assertions that it differs from for^R only at foraging (Mery et al. 2007; Reaume,
Sokolowski and Mery 2011; Edelsparre et al. 2014). We note, however, that the association with
larval foraging behavior is supported by more substantial evidence (Osborne et al., 1997), and 
some trait associations have additional support from transgenic manipulation (Kaun et al. 2007; 
Mery et al. 2007; Donlea et al. 2012; Edelsparre et al. 2014). To our knowledge, however, the 
genome sequences of the reference strains have not been reported. 

It therefore seems that, although variation at the gene foraging can clearly affect 
interesting behaviors, additional evidence is required to support a hypothesis of large-effect 
alleles at high frequency in nature, which affect many aspects of physiology and behavior, and 
thereby create two behavioral "morphs" maintained by balancing selection. To collect additional 
data, we have quantified larval foraging behavior on a set of inbred lines isolated from a wild 
population and attempted to associate trait variation with genotype. Our analysis fails to support
the common alleles of large effect hypothesis, nor were we able to map a causative nucleotide
difference segregating at the for locus. Our analyses of patterns of variation at the locus also do
not support the hypothesis that foraging is a target of balancing selection in the Drosophila
genome.

METHODS 

Fly strains. The reference for^R, for^s, and for^s2 strains were provided by Marla Sokolowski. We
also used inbred lines from the Raleigh (RAL) collection, provided by the Bloomington 
Drosophila Stock Center. These lines were collected by the Mackay lab in Raleigh, North 
Carolina, and each line underwent full-sibling inbreeding for 20 generations to eliminate most 
genetic variation (Ayroles et al. 2009; Mackay et al. 2012). All lines were maintained in 25x95 
mm vials on molasses medium in standard Drosophila incubators, at 25°C under a 12-hr 
light/dark cycle. 

Behavioral measurement. The rover/sitter phenotype was measured for each line similarly to 
previously published methods (deBelle et al. 1987). Oviposition bottles were prepared with 
grape juice agar plates with yeast paste added. Parents were allowed to lay eggs freely for 1 hour 
at 25°C, after which the plates were incubated overnight at 25°C. Twenty-four hours later, 
individual newly hatched larvae were carefully removed from the oviposition plates using 
forceps and transferred to Petri plates containing 35 mL standard Drosophila food. No more 
than 50 larvae were grown in each plate to avoid overcrowding. They were allowed to grow in 
these plates until 96 hours post-hatching. At this time, each third instar larva was removed from 
the food and individually tested for foraging behavior. Only larvae found within the food were 
tested. Any larvae on the food surface or on the surface of the Petri plate were not used. For the 
behavior test, a thin layer of yeast paste was applied to a custom-built plastic plate using a rubber 
squeegee. A larva was placed into the middle of the yeast paste, the yeast paste was covered with 
half a Petri plate, and the larva moved freely for five minutes. After this time, the larval trail, 
visible in the yeast paste, was traced by pen onto the covering Petri plate. The marked plate was 
photographed, and the length of the larval path was measured using the ImageJ software 
package. Sample sizes per line varied from 28-81 (median n=54); raw data are available as 
supplementary data. The R software package was used to further analyze the results. 

Genome-wide association study. Associations were conducted using the log10-transformed
mean path length for each line. Transformed trait values for the 36 lines were uploaded in
December 2013 to the Drosophila Genetic Reference Panel (DGRP) webtool available at
dgrp.gnets.ncsu.edu. This tool, created by the Mackay lab (Mackay et al. 2012), performs
associations between 1,303,322 genetic variants and trait values using ANOVAs of the form
Y = μ + M + ε, where μ is the overall mean, M is the effect of a genetic variant, and ε is the
residual error. Associations were done with only 35 of the
36 genotypes, as no genome was available for strain RAL-765. Full results are available as 
supplementary data. 
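
To make the form of the per-variant test concrete, the following is a minimal sketch of a one-way ANOVA F statistic for a single biallelic variant. It is not the code used by the DGRP webtool: the function name, the toy path lengths, and the genotype coding (0 and 1 for the two alleles) are hypothetical and chosen only for illustration.

```python
import math

def single_marker_f(trait, genotype):
    """One-way ANOVA F statistic for Y = mu + M + e at one variant.

    trait:    per-line trait values (e.g. log10 mean path length)
    genotype: per-line allele codes, in the same order as `trait`
    """
    groups = {}
    for y, g in zip(trait, genotype):
        groups.setdefault(g, []).append(y)
    n = len(trait)
    k = len(groups)  # number of allele classes (2 for a biallelic variant)
    grand_mean = sum(trait) / n
    # Between-allele sum of squares (variation explained by the M term)
    ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                     for v in groups.values())
    # Within-allele sum of squares (residual variation)
    ss_within = sum((y - sum(v) / len(v)) ** 2
                    for v in groups.values() for y in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical mean path lengths (cm) for six lines, log10-transformed before testing
path_cm = [1.2, 1.5, 1.4, 2.6, 2.9, 2.4]
trait = [math.log10(x) for x in path_cm]
genotype = [0, 0, 0, 1, 1, 1]
print(single_marker_f(trait, genotype))
```

The F statistic would then be compared to an F distribution with (k - 1, n - k) degrees of freedom to obtain a p-value. The sketch assumes that both alleles are present among the lines and that there is some variation within alleles.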

Molecular Population Genetics. All population genetic analysis was performed on the complete 
data from Mackay et al. (2012). To test for selection acting at the for locus we performed a 
lineage-specific McDonald-Kreitman test (McDonald and Kreitman 1991) and compared results 
from that test to lineage-specific McDonald-Kreitman tests from every other protein coding locus 
throughout the genome using D. simulans w501 and D. yakuba genomes (Begun et al. 2007) as 
outgroups for polarization of fixed differences. A similar test was also performed using 
unpolarized comparisons. For summary statistics of patterns of polymorphism and divergence we
calculated nucleotide diversity (Tajima 1983), Tajima's D (Tajima 1989), and the ratio of 
nucleotide diversity to divergence on the D. melanogaster lineage ignoring repetitive regions 
masked by RepeatMasker (repeatmasker.org). These statistics were also calculated on 
synonymous and nonsynonymous sites separately. For nearly every summary statistic, we 
calculated the same statistic for the longest transcript from each protein coding locus throughout 
the genome to establish the genome-wide empirical distribution from which to compare 
population genetic summaries at for. All population genetic analysis was performed using scripts
written in Ruby or Python, which are available upon request.
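
As an illustration of the summary statistics named above, the sketch below computes the mean number of pairwise differences and Tajima's D from per-site allele counts, using the standard formulas of Tajima (1989). It is not the authors' script: the function names, the toy allele counts, and the alignment length are hypothetical, and only biallelic sites with no missing data are assumed.

```python
import math

def pairwise_diversity(allele_counts, n):
    """Mean pairwise differences summed over sites, for a sample of n sequences.

    allele_counts: one entry per segregating site, giving how many of the n
    sequences carry one of the two alleles (between 1 and n - 1).
    """
    n_pairs = n * (n - 1) / 2
    return sum(c * (n - c) for c in allele_counts) / n_pairs

def tajimas_d(allele_counts, n):
    """Tajima's D (Tajima 1989) from per-site allele counts at biallelic sites."""
    s = len(allele_counts)  # number of segregating sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = pairwise_diversity(allele_counts, n)
    theta_w = s / a1  # Watterson's estimator of theta
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical toy sample: 10 sequences, 5 segregating sites
counts = [1, 1, 2, 5, 4]
n = 10
length_bp = 1000  # hypothetical alignment length
print(pairwise_diversity(counts, n) / length_bp)  # per-site nucleotide diversity
print(tajimas_d(counts, n))
```

An excess of intermediate-frequency variants raises the pairwise diversity relative to Watterson's estimator and gives a positive D, which is the direction expected under balancing selection.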

RESULTS 

Foraging path lengths were measured in the reference rover and sitter strains as well as 36 
natural isolates of D. melanogaster from the RAL collection (figure 1). These strains were 
captured in Raleigh, North Carolina, and underwent full-sib mating for 20 generations (Ayroles 
et al. 2009). This procedure produces an inbred genotype while allowing very little opportunity 
for selection (though some alleles, like recessive lethals, are purged). 

The for^s and for^s2 strains were found to have shorter average path lengths (1.30 and 1.71
cm) than the for^R strain (2.66 cm), as expected. The foraging paths of the for^s strain were
significantly shorter than those of the for^s2 strain (t-test p=0.002), and both were significantly
shorter than those of the for^R strain (t-test p=1.8E-6 and p=0.016, respectively). All reference
strains traveled shorter distances compared to published data. In Pereira and Sokolowski (1993),
for example, for^s larvae average 10.89 cm and for^R larvae 17.83 cm in the same period of observation.

Data from the 36 inbred RAL strains are distributed with only one mode. The mean path 
length among lines is significantly non-normal (Shapiro-Wilk's Test p=0.015), but not after log10
transformation (Shapiro-Wilk's Test p=0.075). Only three strains had shorter paths than the for^s
strain, while only five strains traveled farther than the for^R strain: the majority of strains fell
between these two reference points (figure 1). 

To investigate whether any of the phenotypic variation among RAL strains may be due to 
alleles at the foraging (for) gene, we performed a genome-wide association study, and also an 
association study at the for gene only. In the latter case, the a priori expectation of an association 
at this locus reduces the loss of power from multiple testing. As annotated at flybase.org 
(genome assembly version 5.54) the for gene is found on the left arm of chromosome 2 from 
base pairs 3,622,074-3,656,953. We analyzed the 607 single nucleotide polymorphisms (SNPs) 
that are mapped to this interval in our 35 sequenced lines; 5,000 bp to either side were also 
included as potential regulatory regions. The most significant association in this interval had a
p-value of only 0.018, uncorrected for multiple tests (figure S1). With 607 parallel tests, this
variant is not close to the Bonferroni-corrected significance value of 8.24E-5. Less conservative
corrections also fail to support an association at this locus: there is no excess of low p-values in
this interval compared to the null expectation (figure S2). 
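
The Bonferroni threshold quoted above follows from dividing the family-wise error rate by the number of tests. The short sketch below reproduces that arithmetic, assuming an error rate of 0.05, which is implied by the reported threshold but not stated explicitly.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

alpha = 0.05        # assumed family-wise error rate
n_snps = 607        # SNPs tested in the for interval
best_p = 0.018      # most significant uncorrected p-value in the interval

threshold = bonferroni_threshold(alpha, n_snps)
print(f"threshold = {threshold:.2e}, significant = {best_p < threshold}")
```

The best p-value in the interval is more than two orders of magnitude above this threshold.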

At the genomic level, there is a slight excess of low p-values compared to the neutral 
expectation (figure S2 and figure 2). No SNPs are significant after Bonferroni correction for 
1,303,332 tests. However, there are 4 SNPs with p-values less than 1E-6, where only one is 
expected by chance (false discovery rate estimate of 25%). All four of these SNPs are within or 
adjacent to genes that could plausibly affect foraging behavior. The most significant SNP 
(p=8.14E-8) is in an intron of the gene sulfateless (figure 3). Sulfateless is the homolog of the
mammalian enzyme glucosaminyl N-deacetylase-N-sulfotransferase, involved in the synthesis of 
heparan sulfate (Ren et al. 2003, Kamimura et al. 2013). Loss of function mutations at 
sulfateless in D. melanogaster cause locomotory defects in larvae by altering synaptic 
transmission at the neuromuscular junction (Ren et al. 2009). The second most significant SNP 
(p=1.39E-7) is in an intergenic region between HSP60B and Excitatory amino acid transporter 2
(Eaat2). Eaat2 is expressed in both the central and peripheral nervous system of third instar
larvae (Besson et al. 2011). Eaat2 is a glutamate:sodium symporter, and glutamate is the
excitatory neurotransmitter at the neuromuscular junction (Jan and Jan 1976). Though
loss-of-function mutations at Eaat2 do not seem to impair larval locomotory performance, these
mutations do affect how larvae respond to chemosensory stimuli such as salt and propionic acid
(Besson et al. 2011). If the putatively significant SNP near Eaat2 affected the expression of this
gene, it could plausibly alter how larvae sense and respond to food in their environment. The 
third most significant SNP (p=2.28E-7) is in the 3' UTR of the gene Shaker cognate w (Shaw). 
Shaw is a voltage-dependent potassium channel in the Kv3 family expressed broadly in the 
nervous system, and affects neuronal excitability when perturbed (Hodge et al. 2005). This SNP 
is only 70.4 kb from the for gene, though there are 27 other genes annotated between Shaw and
for. Finally, the fourth most significant SNP (p=7.26E-7) is within an intron of the
uncharacterized gene CG32204. FlyAtlas gene expression data indicate that this gene is strongly
expressed in the larval and adult nervous system, with little expression detected in other tissues 
(Chintapalli et al. 2007). 
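
The genome-wide significance statements above rest on two simple calculations, sketched below: the Bonferroni threshold for all tests, and the number of p-values below 1E-6 expected by chance. The counts and p-values are taken from the text; the family-wise error rate of 0.05 is an assumption.

```python
# Values from the text, except alpha, which is an assumed family-wise error rate.
n_tests = 1_303_332
alpha = 0.05

bonferroni = alpha / n_tests   # genome-wide Bonferroni threshold
top_p = 8.14e-8                # most significant SNP (sulfateless intron)

p_cutoff = 1e-6
observed_hits = 4
# Expected number of p-values below the cutoff if every test were null
expected_by_chance = n_tests * p_cutoff
fdr_estimate = expected_by_chance / observed_hits

print(f"Bonferroni threshold: {bonferroni:.2e}; top SNP passes: {top_p < bonferroni}")
print(f"Expected by chance: {expected_by_chance:.2f}; FDR estimate: {fdr_estimate:.0%}")
```

The expected count is about 1.3, which gives an estimate of roughly one third. Taking the expected count as one, as in the text, gives 1/4 = 25%.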

Patterns of Polymorphism and Divergence 

To investigate whether there is any signature of balancing or frequency dependent 
selection acting at the for locus (Fitzpatrick et al. 2007), we examined patterns of polymorphism 
and divergence at for using a large set of genome sequences derived from a North American 
population of D. melanogaster (Mackay et al. 2012). If some flavor of balancing selection was 
acting at this locus we would expect to see elevations in polymorphism relative to divergence 
either throughout the locus or in a particular region of the locus that is the target of such 
selection. The for locus, which encodes a rather long protein, has 18 nonsynonymous 
polymorphisms and 76 synonymous polymorphisms in the DGRP sample. Accounting for the 
total length of the longest transcript at for, this amounts to 0.0063 nonsynonymous sites per base
pair (56th percentile among all genes) and 0.028 synonymous sites per base pair (89th percentile).
Nucleotide diversity paints a rather similar picture. Locus-wide π=0.0072, which is greater than
median diversity but only at the 84th percentile among all genes. Nucleotide diversity at
synonymous and nonsynonymous sites appears no different; π_n=0.00061 (59th percentile
genome-wide) and π_s=0.017 (91st percentile genome-wide). Thus while for shows above average
diversity, it does not seem to be a strong outlier as might be expected from an ancient balanced 
polymorphism. 

To formally test the neutral model at the for locus we looked at the site frequency 
spectrum of polymorphism and comparisons of polymorphism and divergence. Under a model of 
balancing selection one expects to see a skew in the site frequency spectrum towards 
intermediate frequency polymorphisms. Using Tajima's D statistic, this would translate into a 
strongly positive value of D. At for we find that D = 0.6074, which when tested against 
coalescent simulations under the standard neutral model yields a p-value of p = 0.204 in a
one-sided test. Further, compared to the genome-wide empirical distribution of D statistics for
protein coding genes, for is at the 74th percentile. This observation suggests that the site frequency
spectrum at for is not particularly unusual or strongly skewed towards intermediate frequencies. 

Another strong prediction under a model of balancing selection is that the ratio of 
polymorphism to divergence should be skewed toward an excess of polymorphism relative to 
the neutral expectation. We tested this prediction in three different ways. First we used an 
unpolarized McDonald-Kreitman (MK) test to examine patterns of polymorphism and 
divergence at synonymous and nonsynonymous sites at for using D. simulans as an outgroup. 
The unpolarized MK test (Table 1) is non-significant (p=0.30; Fisher's exact test) and has a 
Neutrality Index (N.I.) of 0.64 (Rand and Kahn 1996). Comparing the N.I. of for to the 
empirical distribution from all protein-coding genes in the genome, we find that for is near the 
middle of the distribution (31st percentile). The second test of polymorphism and divergence we 
performed was a polarized MK test where we examined only fixations along the D. 
melanogaster lineage. The results are qualitatively similar in this case (p=0.26; N.I.=0.53; 16th 
percentile). Thus neither polarized nor unpolarized MK tests can reject the neutral model. 
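
The logic of the MK test can be sketched in a few lines of Python. The counts below are hypothetical toy values, not those of Table 1, and the two-sided Fisher's exact test is implemented directly from the hypergeometric distribution as an illustration rather than as the software actually used.

```python
from math import comb


def neutrality_index(pn, ps, dn, ds):
    """Neutrality Index: (Pn / Ps) / (Dn / Ds)."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: nonsynonymous/synonymous polymorphisms and fixed differences.
pn, ps, dn, ds = 5, 20, 10, 20
ni_toy = neutrality_index(pn, ps, dn, ds)
p_toy = fisher_exact_two_sided(pn, dn, ps, ds)
```

An N.I. above one reflects an excess of nonsynonymous polymorphism relative to divergence, while values below one reflect the opposite pattern.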

Finally we were interested in examining the ratio of polymorphism and divergence 
directly from the entire locus and from individual windows in a Hudson-Kreitman-Aguade-like 
test (Hudson et al. 1987). The ratio of nucleotide diversity to divergence (again using only 
fixations along the D. melanogaster lineage) for the whole for locus is π/div = 0.419. 
Comparing this ratio to that of all proteins throughout the genome, for is on the higher end of the 
empirical distribution but not a strong outlier (91st percentile). We also examined sliding 
windows of the ratio of polymorphism to divergence (π/div) and the results from this analysis are 
shown in Figure 4. This analysis revealed perhaps the only remarkable feature of genetic 
variation in the for region. While the ratio of polymorphism to divergence throughout the locus is 
below one and rather average for the genome, there is a strong peak in polymorphism relative to 
divergence approximately 8 kb 3' of the for locus. Unfortunately this peak does not overlap any 
known regulatory sequences of for, and moreover there are two protein-coding genes between 
this peak and the end of the for coding sequence. Thus while this may be an intriguing finding, 
we have no ability to say whether it is associated with for or not. After observing that this 
region was unusual, we checked the GWAS data to see if any variants in this interval were 
associated with foraging behavior. Thirty-eight SNPs in this interval (3,613,450 - 3,613,450) 
were included in the GWAS. The most significant association of these is a G/T polymorphism at 
6,614,038 (p=0.0047; p = 0.16 after correcting for 38 parallel tests). 
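
The text does not name the correction applied to the 38 tests. As an assumption, the sketch below compares the Šidák and Bonferroni adjustments for the reported raw p-value; the function names are illustrative.

```python
def sidak(p, m):
    """Sidak-adjusted p-value for m independent tests."""
    return 1.0 - (1.0 - p) ** m


def bonferroni(p, m):
    """Bonferroni-adjusted p-value for m tests, capped at 1."""
    return min(1.0, p * m)


p_raw, num_tests = 0.0047, 38
p_sidak = sidak(p_raw, num_tests)
p_bonf = bonferroni(p_raw, num_tests)
```

Under these assumptions the Šidák form rounds to the reported 0.16, while Bonferroni gives a slightly larger value; neither approaches conventional significance.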

DISCUSSION 

There is little doubt that decades of work on variation at the gene foraging has proven 
interesting. Considerable mapping efforts culminated in the connection of variation in larval 
behavior to the gene foraging (Osborne et al. 1997), and this catalyzed additional research into 
this interesting gene in other systems (Ben-Shahar et al. 2003). The prevalence of large-effect 
alleles in nature, however, is not well established, despite published claims that D. melanogaster 
populations are comprised of 70% rovers and 30% sitters (Osborne et al. 1997; Sokolowski 
2001). Published data are subject to alternative interpretation due to uncontrolled environmental 
variation (Sokolowski 1980; Sokolowski 1982) or mass culture (Sokolowski, Pereira & Hughes 
1997). The only published data using independently sampled genotypes reared in a common 
environment, that we are aware of, does not appear to support a gene of large effect (Bauer & 
Sokolowski 1984). To gather additional data about variation in larval foraging behavior, we have 
assayed a set of 36 inbred lines derived from flies collected in the Raleigh, North Carolina 
farmer's market. The distribution of larval foraging behavior among these lines is not bimodal, nor is 
trait variation associated with genetic variation at or near the foraging locus. 

Using standard analyses from molecular population genetics we sought to detect any 
signature of balancing selection that may maintain two (or more) alleles at for in nature as has 
been suggested in the literature (Fitzpatrick et al. 2007). None of our analyses suggest that for is 
under strong balancing selection, nor do they suggest that patterns of variation at for are 
somehow unusual within the D. melanogaster genome. Given the very large sample of genomes 
in this analysis, we have no reason to believe that our statistical tests should be in any way 
underpowered. Thus our failure to reject the null model of neutrality for both patterns of 
polymorphism and divergence as well as the site frequency spectrum suggests that strong 
balancing selection at for is unlikely. 

These negative results have several major caveats. One possibility is that the trait we 
measured is not a high-fidelity replication of the "foraging behavior" measured in previous 
publications. It is notable that, though the for^R strain had a path length more than twice as long as 
for^s in our data, the lengths of all paths are shorter than reported for these strains previously, 
despite careful replication of the phenotyping protocols published earlier. If subtle differences in 
larval condition or assay environment are crucial and different, the effect of for might fail to be 
expressed. Furthermore, we only assessed larval foraging distance in the presence of food. This 
is consistent with and directly comparable to most published data, including both published 
bimodal distributions. It seems possible that foraging variation would be more apparent if the 
difference in path length with and without food was assayed — especially if there were other 
genetic variation affecting overall motility. In contrast to the isofemale lines and outbred 
populations described previously, the RAL lines used here are inbred, and any recessive 
variation in overall performance could be a more substantial confounder. We also note that there 
could be genetic variants at the for locus that are not annotated in the RAL genome alignments. 
Though large-effect mutations have been found in nature for traits like pigmentation and 
insecticide resistance, it is still unclear when and why such large effects should be expected 
(Stern & Orgogozo 2008; Rockman 2012), and we hope that our negative results will motivate 
further work in this area. 

Finally, despite the overall negative result, we report that larval path length is putatively 
associated with genetic variants at several loci other than for. Though the genes underlying these 
associations are intriguing, these associations should be considered preliminary until further 
replication is conducted. Attempting a genome-wide association study with only 36 lines is an 
ambitious scheme (Long & Langley 1999), and one that is unlikely to be successful unless trait 
variation is due primarily to a single gene (as previously postulated for larval foraging). Despite 
this fact, it is encouraging that there is a slight increase in low p-values compared to a null 
expectation of a uniform distribution from zero to one. This null expectation could be violated 
for reasons other than true positives, such as cryptic population structure or other violations of 
the assumptions of the technique. The four most significant associations are all in or near genes 
that could plausibly affect larval motility, and we hope this result will motivate further work on 
this reference collection of genotypes. 



ACKNOWLEDGEMENTS 



Downloaded from http://biorxiv.org/on September 18, 2014 



We are grateful to the Sokolowski lab and the Bloomington Drosophila Stock Center for 
providing stocks, and to the Mackay lab for constructing the RAL collection. Funding was 
provided by the National Institutes of Health (R01 GM098614 to TLT) and by support to the 
Center for Scientific Computing at UCSB from the US National Science Foundation (NSF 
MRSEC DMR-1121053 and NSF CNS-0960316). ADK was funded in part by NSF 
MCB-1052148 and by DOE/USDA 124336. DRS was supported by the National Institutes of Health 
under Ruth L. Kirschstein National Research Service Award F32 GM105231. 

REFERENCES 

Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, et al. (2000) The genome sequence of 
Drosophila melanogaster. Science 287: 2185-2195. 

Ayroles JF, Carbone MA, Stone EA, Jordan KW, Lyman RF, et al. (2009) Systems genetics of 
complex traits in Drosophila melanogaster. Nat. Genet. 41: 299-307. 

Bauer SJ, Sokolowski MB (1984) Larval foraging behavior in isofemale lines of Drosophila 
melanogaster and D. pseudoobscura. J. of Heredity 75: 131-134. 

Bauer SJ, Sokolowski MB (1985) A genetic analysis of path length and pupation height in a 
natural population of Drosophila melanogaster. Genome 27(3): 334-340. 

Begun DJ, Holloway AK, Stevens K, Hillier LW, Poh YP, et al. (2007) Population genomics: 
whole-genome analysis of polymorphism and divergence in Drosophila simulans. Plos Biology 
5(11): e310. 

Ben-Shahar Y, Leung H-T, Pak WT, Sokolowski MB, Robinson GE (2003) cGMP-dependent 
changes in phototaxis: a possible role for the foraging gene in honey bee division of labor. J Exp 
Biol 206, 2507-2515. 

Chintapalli VR, Wang J, Dow JAT (2007). Using Fly Atlas to identify better Drosophila models 
of human disease. Nature Genetics 39: 715-720 

de Belle JS, Sokolowski MB (1987) Heredity of rover/sitter: Alternative foraging strategies of 
Drosophila melanogaster larvae. Heredity 59: 73-83. 

de Belle JS, Hilliker AJ, Sokolowski MB (1989) Genetic localization of foraging (for): A major 
gene for larval behavior in Drosophila melanogaster. Genetics 123: 157-163. 

de Belle JS, Sokolowski MB, Hilliker AJ (1993) Genetic analysis of the foraging microregion of 
Drosophila melanogaster. Genome. 36, 94-101. 

Besson MT, Sinakevitch I, Melon C, Iche-Torres M, Birman S (2011). Involvement of the 
Drosophila taurine/aspartate transporter dEAAT2 in selective olfactory and gustatory 
perceptions. J. Comp. Neurol. 519(14): 273-257. 



de Bono M, Bargmann C (1998) Natural variation in a neuropeptide Y receptor homolog 
modifies social behavior and food response in C. elegans. Cell 94(5): 679-689 

Donlea J, Leahy A, Thimgan MS, Suzuki Y, Hughson BN, Sokolowski MB, Shaw PJ (2012) 
foraging alters resilience/vulnerability to sleep disruption and starvation in Drosophila. Proc. 
Natl. Acad. Sci. U.S.A. 109(7): 2613-2618. 

Chen A, Kramer EF, Purpura L, Krill JL, Zars T, Dawson-Scully K (2011) The influence of 
natural variation at the foraging gene on thermotolerance in adult Drosophila in a narrow 
temperature range. J. Comp. Physiol. A, Neuroethol. Sens. Neural. Behav. Physiol. 197(12): 
1113-1118. 

Edelsparre AH, Vesterberg A, Lim JH, Anwari M, Fitzpatrick MJ (2014) Alleles underlying 
larval foraging behaviour influence adult dispersal in nature. Ecology Letters, pre-published 
online Jan 6, 2014. DOI: 10.1111/ele.12234 

Ephrussi B, Herold JL (1945) Studies of eye pigments of Drosophila. Genetics 30: 62-83 

Fitzpatrick MJ, Feder E, Rowe L, Sokolowski MB (2007) Maintaining a behaviour 
polymorphism by frequency-dependent selection on a single gene. Nature. 447: 210-212. 

Foucaud J, Philippe AS, Moreno C, Mery F (2013) A genetic polymorphism affecting reliance 
on personal versus public information in a spatial learning task in Drosophila melanogaster. 
Proc. Biol. Sci. 280(1760): 20130588 

Gioia A, Zars T (2009). Thermotolerance and place memory in adult Drosophila are independent 
of natural variation at the foraging locus. Journal of Comparative Physiology 195: 777-782. 

Graf SA, Sokolowski MB (1989) Rover/sitter Drosophila melanogaster larval foraging 
polymorphism as a function of larval development, food-patch quality, and starvation. Journal of 
Insect Behavior 2(3) 301-313. 

Griffith OL, Montgomery SB, Bernier B, Chu B, Kasaian K, et al. (2008) ORegAnno: 
an open-access community-driven resource for regulatory annotation. Nucleic Acids Res 36: 
D107-D113. 

Hartigan JA, Hartigan PM (1985) The dip test of unimodality. Annals of Statistics 13(1): 70-84. 

Hodge JJL, Choi JC, O'Kane CJ, Griffith LC (2005). Shaw potassium channel genes in 
Drosophila. J. Neurobiol. 63(3): 235-254 

Hudson RR, Kreitman M, Aguade M (1987) A test of neutral molecular evolution based on 
nucleotide data. Genetics 116(1): 153-159. 

Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2(8): e124. 



Jan LY, Jan YN (1976) L-Glutamate as an excitatory transmitter at the Drosophila larval 
neuromuscular junction. Journal of Physiology. 262(1): 215-236 

Kamimura K, Ueno K, Nakagawa J, Hamada R, Saitoe M, Maeda N (2013) Perlecan regulates 
bidirectional Wnt signaling at the Drosophila neuromuscular junction. Journal of Cell Biology 
200(2): 219-233. 

Kaun KR, Chakaborty-Chatterjee M, Sokolowski MB (2008). Natural variation in plasticity of 
glucose homeostasis and food intake. J. Exp. Bio. 211: 3160-3166. 

Kaun KR, Hendel T, Gerber B, Sokolowski MB (2007) Natural variation in Drosophila larval 
reward learning and memory due to a cGMP-dependent protein kinase. Learn. Mem. 14(5): 342- 
349. 

Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D (2002) The 
human genome browser at UCSC. Genome Res. 12(6):996-1006. 

Kent CF, Daskalchuk T, Cook L, Sokolowski MB, Greenspan RJ (2009) The Drosophila 
foraging gene mediates adult plasticity and gene-environment interactions in behaviour, 
metabolites, and gene expression in response to food deprivation. PLoS Genet 5(8): e1000609. 
doi:10.1371/journal.pgen.1000609 

Lawson LP, Vander Meer RK, Shoemaker D (2012) Male reproductive fitness and queen 
polyandry are linked to variation in the supergene Gp-9 in the fire ant Solenopsis invicta. Proc. 
R. Soc. B. 279(1741): 3217-3222. 

Long AD, Langley CH (1999) The power of association studies to detect the contribution of 
candidate genetic loci to variation in complex traits. Genome Research 9: 720-731. 

Mackay TFC, Richards S, Stone EA, Barbadilla A, Ayroles JF, et al. (2012) The Drosophila 
melanogaster genetic reference panel. Nature 482: 173-178. 

McDonald J, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. 
Nature 351: 652-654. 

MacPherson MR, Broderick KE, Graham S, Day JP, Houslay MD, Dow JA, Davies SA (2004) 
The dg2 (for) gene confers a renal phenotype in Drosophila by modulation of cGMP-specific 
phosphodiesterase. J. Exp. Biol. 207(16): 2769-2776. 

McGrath PT, Rockman MV, Zimmer M, Jang H, Macosko EZ, Kruglyak L, Bargmann CI 
(2009) Quantitative mapping of a digenic behavioral trait implicates globin variation in C. 
elegans sensory behaviors. Neuron 61(5): 692-699. 

Mery F, Belay AT, So AK, Sokolowski MB, Kawecki TJ (2007) Natural polymorphism 
affecting learning and memory in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 104(32): 13051- 
13055. 



Montgomery S, Griffith OL, Sleumer MC, Bergman CM, Bilenky M, Pleasance E, Prychyna Y, 
Zhang X, Jones SJ (2006) ORegAnno: an open access database and curation system for 
literature-derived promoters, transcription factor binding sites and regulatory variation. 
Bioinformatics 22: 637-640. 

Nagle KJ, Bell WJ (1987) Genetic control of the search tactic of Drosophila melanogaster. An 
ethometric analysis of rover/sitter traits in adult flies. Behavior Genetics 17(4), 385-408. 

Osborne KA, Robichon A, Burgess E, Butland S, Shaw RA, Coulthard A, Pereira HS, Greenspan 
RJ, Sokolowski MB (1997) Natural behavior polymorphism due to a cGMP-dependent protein 
kinase of Drosophila. Science 277(5327): 834-836 

Pereira HS, Sokolowski MB (1993) Mutations in the larval foraging gene affect adult 
locomotory behavior after feeding in Drosophila melanogaster. PNAS 90:5044-5046. 

Rand DM, Kahn LM (1996) Excess amino acid polymorphism in mitochondrial DNA: contrasts 
among genes from Drosophila, mice, and humans. Mol. Bio. Evol. 13(6): 735-748. 

Reaume CJ, Sokolowski MB, Mery F (2011) A natural genetic polymorphism affects retroactive 
interference in Drosophila melanogaster. Proc. Biol. Sci. 278(1702): 91-98. 

Ren Y, Kirkpatrick CA, Rawson JM, Sun M, Selleck SB (2003) Cell type-specific requirements 
for heparan sulfate biosynthesis at the Drosophila neuromuscular junction: Effects on synapse 
function, membrane trafficking, and mitochondrial localization. The Journal of Neuroscience. 
29(26): 8539-8550. 

Renger JJ, Yao W-D, Sokolowski MB, Wu C-F (1999) Neuronal polymorphism among natural 
alleles of a cGMP-dependent kinase gene, foraging, in Drosophila. J of Neuroscience 19: RC28 

Rockman MV, Kruglyak L (2009) Recombinational landscape and population genomics of 
Caenorhabditis elegans. PLoS Genet 5(3): e1000419. 

Rockman MV (2012) The QTN program and the alleles that matter for evolution: All that's gold 
does not glitter. Evolution 66(1): 1-17. 

Service PM (2004) How good are quantitative complementation tests? Sci. Aging Knowl. 
Environ. 12: 13. 

Sokolowski MB (1980) Foraging strategies of Drosophila melanogaster. A chromosomal 
analysis. Behavior Genetics. 10:3, 291-302 

Sokolowski MB (1982) Rover and sitter larval foraging patterns in a natural population of D. 
melanogaster. Dros. Inform. Serv. 58:138-139. 



Sokolowski MB, Hansell RIC (1983) Drosophila larval foraging behavior. I. The sibling 
species, D. melanogaster and D. simulans. Behav. Genet. 13:159-168. 

Sokolowski MB, Hansell RIC, Rotin D (1983) Drosophila larval foraging behavior. II. Selection 
in the sibling species, D. melanogaster and D. simulans. Behav. Genet. 13(2): 169-177. 

Sokolowski MB, Bauer SJ, Wai-Ping V, Rodriguez L, Wong JL, Kent C (1986) Ecological 
genetics and behaviour of Drosophila melanogaster larvae in nature. Animal Behavior 34, 403- 
408. 

Sokolowski MB, Pereira HS, Hughes K (1997) Evolution of foraging behavior in Drosophila by 
density-dependent selection. PNAS 94: 7373-7377 

Sokolowski MB (2001) Drosophila: Genetics meets behaviour. Nature Reviews Genetics 2, 879- 
890. 

Stern C (1926) Eine kreuzungsanalyse von korperfarbungen von Drosophila melanogaster 
verbunden mit drei neuen allelomorphen des faktors 'ebenholz'. Z. indukt. Abstamm.- u. 
VererbLehre 41: 198-215. 

Stern DL, Orgogozo V (2008) The loci of evolution: How predictable is genetic evolution? 
Evolution 62(9): 2155-2177. 

Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 
105(2): 437-460. 

Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA 
polymorphism. Genetics 123(3): 585-595. 

Thomas JW, Caceres M, Lowman JJ, Morehouse CB, Short ME, Baldwin ML, Maney DL, 
Martin CL (2008) The chromosomal polymorphism linked to variation in social behavior in the 
white-throated sparrow (Zonotrichia albicollis) is a complex rearrangement and suppressor of 
recombination. Genetics 179(3): 1455-1468. 



Figure 1. Variation in larval foraging behavior. Boxplot of foraging path lengths for 36 inbred 
RAL strains (not individually labeled, for clarity) and the reference rover (r), sitter (s) and sitter 2 
(s2) genotypes. The median for each line is shown, surrounded by a box (1st to 3rd quartiles), 
whiskers (range excepting outliers), and circles (outliers); sample sizes per line vary from 28 to 
81 (median n=54). 
Figure 2. Genome-wide association study results. A Manhattan plot of p-values for all variants 
is shown. Chromosome arms are shown in alternating colors, and the four most significant 
variants are circled in red. 
Figure 3. The most significant association in the genome. A G/A polymorphism on the left 
arm of chromosome 3 at base pair 6,537,335 was the most significant in the genome. Shown are 
the phenotypes of the 5 lines with a G and the 31 lines with an A. Associations were done with 
log-transformed data, but untransformed data are shown for clarity. 
Figure 4. Polymorphism and divergence at the for locus. An image from the UCSC Genome 
Browser (Kent et al. 2002) showing the various isoforms of for and its flanking regions. The 
ratio of nucleotide diversity to polarized divergence on the D. melanogaster lineage is shown in 
1 kb windows sliding every 100 bp, labeled as piOverDiv. Regulatory elements from ORegAnno 
(Montgomery et al. 2006; Griffith et al. 2008) are also shown, as are repetitive elements from 
RepeatMasker (http://www.repeatmasker.org). 
Table 1. Unpolarized McDonald-Kreitman test of for. 
Figure S1. Genome-wide association study at foraging. A Manhattan plot of p-values at the 
foraging locus is shown on the same scale as Figure 2. 
Figure S2. Distribution of p-values at foraging. The observed p-values for the 607 SNPs at 
foraging are shown against a uniform expectation. 
Figure S3. Distribution of p-values genome-wide. The observed p-values for the 1,301,321 
SNPs genome-wide are shown against a uniform expectation. 




